Fix terms aggregation doc_count_error_upper_bound for already reduced results (batched query phase) #134645

benchaplin · 2025-09-12T15:03:36Z

There's logic in AbstractInternalTerms that sets the top-level doc_count_error_upper_bound to 0 if only one aggregation is being reduced. However, now that we have batched query execution, a reduction of multiple aggregations may already have occurred on a data node, and its result may now be undergoing a final reduction on the coordinating node.

I discovered this case while attempting to remove the settings override turning off batched query execution in TermsDocCountErrorIT (testFixedDocs with -Dtests.seed=ABFC03388645940D):

3 shards on a data node, 0 shards on the coordinating node
batched request performs partial reductions on the data node, calculating the expected value of doc_count_error_upper_bound = 46
coordinating node attempts the final reduction, but since it's reducing only one agg, sets the doc_count_error_upper_bound to 0

I've attempted a lightweight fix here: flag the existence of a batched query result in the AggregationReduceContext, then check this flag when potentially setting doc_count_error_upper_bound to 0. If a batched query result is present, it implies multiple shards were reduced remotely (a single shard on a data node is not batched).
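
To make the check concrete, here is a minimal, self-contained sketch of the decision described above. The class and method names (`DocCountErrorSketch`, `resolve`) and the plain boolean parameter are hypothetical stand-ins, not Elasticsearch APIs; only the condition mirrors the change in this PR.

```java
// Hypothetical model of the check; not the actual AbstractInternalTerms code.
final class DocCountErrorSketch {

    /**
     * @param size              number of aggregations being reduced in this step
     * @param sumDocCountError  summed doc count error carried by those aggregations
     * @param hasBatchedResult  assumed flag: true if a data node already reduced several shards
     */
    static long resolve(int size, long sumDocCountError, boolean hasBatchedResult) {
        // With one aggregation and no earlier reduction, the existing logic reports 0.
        // With one aggregation that is itself a batched partial reduction, keep its error.
        return size == 1 && hasBatchedResult == false ? 0 : sumDocCountError;
    }

    public static void main(String[] args) {
        // Scenario from the description: one already-reduced result carrying an error of 46.
        System.out.println(resolve(1, 46, false)); // prints 0 (behaviour before the fix)
        System.out.println(resolve(1, 46, true));  // prints 46 (behaviour with the flag set)
    }
}
```

In the testFixedDocs scenario, the coordinating node sees `size == 1` because the three shards were already reduced on the data node, so without the flag the expected value of 46 is overwritten with 0.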

elasticsearchmachine · 2025-09-12T15:04:02Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine · 2025-09-12T15:04:02Z

Pinging @elastic/es-search-foundations (Team:Search Foundations)

jimczi

The logic makes sense to me. I left one question and some nits regarding how to expose the new flag.

jimczi · 2025-09-18T13:05:41Z

server/src/main/java/org/elasticsearch/search/aggregations/AggregationReduceContext.java


    protected abstract AggregationReduceContext forSubAgg(AggregationBuilder sub);

+    public boolean doesFinalReduceHaveBatchedResult() {


nit: that feels weird since it depends on isFinalReduce? Maybe rename it to hasBatchedResult so that callers have to check both isFinalReduce and hasBatchedResult?

jimczi · 2025-09-18T13:06:09Z

server/src/main/java/org/elasticsearch/search/aggregations/AggregationReduceContext.java

+        return finalReduceHasBatchedResult;
+    }
+
+    public void setFinalReduceHasBatchedResult(boolean finalReduceHasBatchedResult) {


This should be final and set in the ForFinal constructor?
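
Taken together, the two nits above (rename to hasBatchedResult, make the field final and set it in the ForFinal constructor) might look roughly like the sketch below. This is a simplified, hypothetical shape, not the real AggregationReduceContext: `ReduceContextSketch`, its nested classes, and the `docCountError` helper are assumptions made for illustration.

```java
// Hypothetical, simplified shape of the two suggestions; not the real AggregationReduceContext.
abstract class ReduceContextSketch {

    abstract boolean isFinalReduce();

    /** Defaults to false; in this sketch only a final reduction can carry a batched result. */
    boolean hasBatchedResult() {
        return false;
    }

    static final class ForPartial extends ReduceContextSketch {
        @Override
        boolean isFinalReduce() {
            return false;
        }
    }

    static final class ForFinal extends ReduceContextSketch {
        // Immutable: fixed at construction, so no setter is needed.
        private final boolean hasBatchedResult;

        ForFinal(boolean hasBatchedResult) {
            this.hasBatchedResult = hasBatchedResult;
        }

        @Override
        boolean isFinalReduce() {
            return true;
        }

        @Override
        boolean hasBatchedResult() {
            return hasBatchedResult;
        }
    }

    /** Caller-side check that combines both flags explicitly. */
    static long docCountError(ReduceContextSketch ctx, int size, long sumDocCountError) {
        boolean reducedRemotely = ctx.isFinalReduce() && ctx.hasBatchedResult();
        return size == 1 && reducedRemotely == false ? 0 : sumDocCountError;
    }
}
```

With this shape, the caller reads as "final reduce and batched result", which avoids a method name that silently depends on isFinalReduce.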

jimczi · 2025-09-18T13:08:00Z

.../src/main/java/org/elasticsearch/search/aggregations/bucket/terms/AbstractInternalTerms.java

+                // If we are reducing only one aggregation (size == 1), the doc count error should be 0.
+                // However, the presence of a batched query result implies this is a final reduction and a partial reduction with size > 1
+                // has already occurred on a data node. The doc count error should not be 0 in this case.
+                docCountError = size == 1 && reduceContext.doesFinalReduceHaveBatchedResult() == false ? 0 : sumDocCountError;


Does that also handle the case where the partial reduction happens on the coord node (when reaching reduce batch size)?

Don't set doc count error to 0 when batched reduction occurred

2925777

benchaplin mentioned this pull request Sep 8, 2025

[Meta] Batched Query Phase Follow-up Tasks #125788

Open

6 tasks

jimczi reviewed Sep 18, 2025
